Shahmukhi to Gurmukhi Transliteration System

Authors

  • Tejinder Singh Saini
  • Gurpreet Singh Lehal
  • Virinder S. Kalra
Abstract

The existence of two scripts for the Punjabi language has created a script barrier between the Punjabi literature written in India and that written in Pakistan. This research has developed the first system of its kind for transliterating Shahmukhi text written without diacritical marks. The proposed Shahmukhi-to-Gurmukhi transliteration system has been implemented using several corpus-based research techniques. Corpora of both scripts are analysed to generate statistical data of different types, such as character frequencies, word frequencies, and bi-gram frequencies. These statistics are used in different phases of transliteration. Potentially, all members of the substantial Punjabi community will benefit from this transliteration system.
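The corpus statistics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `corpus_statistics`, the whitespace tokenisation, and the choice of character-level bi-grams counted within words are all assumptions made for illustration, since the abstract does not say whether the bi-grams are over characters or words.

```python
from collections import Counter


def corpus_statistics(text):
    """Return character, word, and character bi-gram counts for a text.

    Hypothetical helper: tokenisation is plain whitespace splitting, and
    bi-grams are character pairs (an assumption, not the paper's definition).
    """
    words = text.split()
    # Character frequencies, ignoring the whitespace between words.
    char_freq = Counter(ch for word in words for ch in word)
    # Word frequencies over the whitespace-separated tokens.
    word_freq = Counter(words)
    # Bi-grams are counted inside each word, so they never span a word boundary.
    bigram_freq = Counter(
        word[i:i + 2] for word in words for i in range(len(word) - 1)
    )
    return char_freq, word_freq, bigram_freq


# A tiny made-up Shahmukhi sample; a real corpus would be far larger.
sample = "پنجابی زبان پنجابی"
char_freq, word_freq, bigram_freq = corpus_statistics(sample)
print(word_freq.most_common(1))
print(bigram_freq.most_common(3))
```

Counts of this kind could then feed later stages, for example ranking candidate Gurmukhi spellings by how often they occur in the target corpus, although the abstract does not describe how the statistics are actually applied.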

Similar resources

Shahmukhi to Gurmukhi Transliteration System: A Corpus based Approach

This research paper describes a corpus-based transliteration system for the Punjabi language. The existence of two scripts for the Punjabi language has created a script barrier between the Punjabi literature written in India and that written in Pakistan. This research project has developed the first system of its kind for the Shahmukhi script of the Punjabi language. The proposed system for Shahmukhi to Gurm...


Conversion between Scripts of Punjabi: Beyond Simple Transliteration

This paper describes statistical techniques used for modelling transliteration systems between the scripts of the Punjabi language. Punjabi is one of the few languages that are written in more than one script. In India, Punjabi is written in the Gurmukhi script, while in Pakistan it is written in the Shahmukhi (Perso-Arabic) script. Shahmukhi script has its origin in the ancient Phoenician script wher...


An Omni-Font Gurmukhi to Shahmukhi Transliteration System

This paper describes a font-independent Gurmukhi-to-Shahmukhi transliteration system. Even though Unicode is gaining popularity, there is still a lot of material in Punjabi that is available in ASCII-based fonts. A problem with ASCII fonts for Punjabi is that there is no standardisation of the mapping of Punjabi characters, and a Gurmukhi character may be internally mapped to different keys in differ...


Punjabi Machine Transliteration

Machine transliteration transcribes a word written in one script into another script with approximate phonetic equivalence. It is useful for machine translation, cross-lingual information retrieval, and multilingual text and speech processing. Punjabi Machine Transliteration (PMT) is a special case of machine transliteration and is a process of converting a word from Shahmukhi (based on Arabic s...


A Transliteration based Word Segmentation System for Shahmukhi Script

Word segmentation is an important prerequisite for almost all Natural Language Processing (NLP) applications. Since the word is a fundamental unit of any language, almost every NLP system first needs to segment input text into a sequence of words before further processing. In this paper, Shahmukhi word segmentation is discussed in detail. The presented word segmentation module is part of Shah...




Publication date: 2008